Depth First Exploration of a Configuration Model
We introduce an algorithm that constructs a uniform random graph with a
prescribed degree sequence together with a depth-first exploration of it. In
the so-called supercritical regime, where the graph contains a giant component,
we prove that the renormalized contour process of the Depth First Search Tree
has a deterministic limiting profile that we identify. The proof goes through a
detailed analysis of the evolution of the empirical degree distribution of
unexplored vertices. This evolution is driven by an infinite system of
differential equations, which has a unique and explicit solution. As a
byproduct, we deduce the existence of a macroscopic simple path and obtain a
lower bound on its length.
Comment: 30 pages
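As a rough illustration of the coupling the abstract describes, here is a minimal Python sketch, not the paper's algorithm and with illustrative names, that builds a configuration-model multigraph on the fly while running a depth-first exploration: each vertex holds unpaired half-edges, and whenever the exploration needs a neighbor, one of the current vertex's half-edges is paired with a uniformly random unpaired half-edge.

```python
import random

def dfs_configuration_model(degrees, seed=None):
    """Couple the construction of a configuration-model multigraph with a
    depth-first exploration of it; returns the DFS forest as a parent map.
    `degrees` is the prescribed degree sequence."""
    rng = random.Random(seed)
    n = len(degrees)
    # one entry per unpaired half-edge ("stub")
    pool = [v for v in range(n) for _ in range(degrees[v])]
    visited = [False] * n
    parent = {}

    def pop_uniform_stub():
        # remove and return a uniformly random unpaired half-edge
        i = rng.randrange(len(pool))
        pool[i], pool[-1] = pool[-1], pool[i]
        return pool.pop()

    for root in range(n):
        if visited[root]:
            continue
        visited[root] = True
        stack = [root]
        while stack:
            v = stack[-1]
            if v not in pool:          # v has no unpaired half-edge left
                stack.pop()
                continue
            pool.remove(v)             # consume one half-edge of v ...
            if not pool:               # odd total degree: dangling stub
                break
            u = pop_uniform_stub()     # ... and pair it uniformly at random
            if not visited[u]:         # tree edge of the DFS forest
                visited[u] = True
                parent[u] = v
                stack.append(u)
            # otherwise the pair is a back edge, loop or multi-edge: ignore
    return parent

# toy run: 3-regular degree sequence on 1000 vertices (supercritical)
forest = dfs_configuration_model([3] * 1000, seed=0)
print(len(forest), "tree edges found")
```

Pairing half-edges only when the exploration requests them is what makes the graph construction and the DFS a single process; the DFS forest, and hence a contour process, can be read off the returned parent map.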
A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification
One of the pursued objectives of deep learning is to provide tools that learn
abstract representations of reality from the observation of multiple contextual
situations. More precisely, one wishes to extract disentangled representations
which are (i) low dimensional and (ii) whose components are independent and
correspond to concepts capturing the essence of the objects under consideration
(Locatello et al., 2019b). One step towards this ambitious project consists in
learning disentangled representations with respect to a predefined (sensitive)
attribute, e.g., the gender or age of the writer. Perhaps one of the main
applications of such disentangled representations is fair classification.
Existing methods extract the last layer of a neural network trained with a loss
composed of a cross-entropy objective and a disentanglement regularizer. In
this work, we adopt an information-theoretic view of this problem, which
motivates a novel family of regularizers that minimizes the mutual information
between the latent representation and the sensitive attribute conditional on
the target. The resulting set of losses, called CLINIC, is parameter-free and
thus easier and faster to train. CLINIC
losses are studied through extensive numerical experiments by training over 2k
neural networks. We demonstrate that our methods offer a better
disentanglement/accuracy trade-off than previous techniques, and generalize
better than training with the cross-entropy loss alone, provided that the
disentanglement task is not too constraining.
Comment: Findings AACL 2022
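The general recipe, a cross-entropy objective plus a regularizer that penalizes dependence between the representation and the sensitive attribute within each target class, can be sketched as follows. This is only a hedged illustration: the penalty below is a crude within-class mean-matching proxy for I(Z; S | Y), not the CLINIC estimator, and the names (`conditional_disentanglement_penalty`, `lam`) are ours.

```python
import torch
import torch.nn.functional as F

def conditional_disentanglement_penalty(z, s, y, num_classes):
    """Crude proxy for the conditional mutual information I(Z; S | Y):
    within each target class, penalize the gap between the mean latent
    representations of the two sensitive groups.
    z: (B, d) latents, s: (B,) binary sensitive attribute, y: (B,) targets."""
    penalty = z.new_zeros(())
    for c in range(num_classes):
        z0 = z[(y == c) & (s == 0)]
        z1 = z[(y == c) & (s == 1)]
        if len(z0) > 0 and len(z1) > 0:
            penalty = penalty + (z0.mean(0) - z1.mean(0)).pow(2).sum()
    return penalty

def fair_loss(logits, z, s, y, num_classes, lam=1.0):
    """Cross-entropy objective plus the disentanglement regularizer."""
    return F.cross_entropy(logits, y) + lam * conditional_disentanglement_penalty(
        z, s, y, num_classes
    )
```

Unlike adversarial disentanglement, a penalty of this shape introduces no extra trainable parameters, which is one sense in which a parameter-free regularizer is easier and faster to train.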
The Glass Ceiling of Automatic Evaluation in Natural Language Generation
Automatic evaluation metrics capable of replacing human judgments are
critical to allowing fast development of new methods. Thus, numerous research
efforts have focused on crafting such metrics. In this work, we take a step
back and analyze recent progress by comparing the body of existing automatic
metrics and human metrics altogether. Since metrics are ultimately used to
rank systems, we compare them in the space of system rankings. Our extensive
statistical analysis reveals surprising findings: automatic metrics -- old and
new -- are much more similar to each other than to humans. Automatic metrics
are not complementary and rank systems similarly. Strikingly, human metrics
predict each other much better than the combination of all automatic metrics
used to predict a human metric. This is surprising because human metrics are
often designed to be independent and to capture different aspects of quality, e.g.,
content fidelity or readability. We provide a discussion of these findings and
recommendations for future work in the field of evaluation.
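Comparing metrics "in the space of system rankings" can be made concrete with a small sketch: treat each metric as the ranking it induces over systems and measure pairwise agreement with Kendall's tau. This only illustrates the setup, the paper's statistical analysis is far more extensive, and the scores below are made up.

```python
import numpy as np
from scipy.stats import kendalltau

def ranking_similarity(scores):
    """scores[m][s] = score that metric m assigns to system s.
    Returns the matrix of Kendall tau correlations between the system
    rankings induced by each pair of metrics."""
    m = len(scores)
    tau = np.eye(m)
    for i in range(m):
        for j in range(i + 1, m):
            tau[i, j] = tau[j, i] = kendalltau(scores[i], scores[j])[0]
    return tau

# made-up scores: 3 metrics evaluating 5 systems
scores = np.array([
    [0.10, 0.40, 0.20, 0.90, 0.50],   # automatic metric A
    [0.15, 0.45, 0.10, 0.80, 0.60],   # automatic metric B
    [0.90, 0.10, 0.80, 0.20, 0.30],   # human metric
])
print(ranking_similarity(scores).round(2))
```

In this toy example the two automatic metrics correlate strongly with each other and weakly with the human one, which is the shape of the finding reported above.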
Online Matching in Geometric Random Graphs
We investigate online maximum cardinality matching, a central problem in ad
allocation. In this problem, users are revealed sequentially, and each new user
can be paired with any previously unmatched campaign that it is compatible
with. Despite the limited theoretical guarantees, the greedy algorithm, which
matches incoming users with any available campaign, exhibits outstanding
performance in practice. Some theoretical support for this practical success
was established in specific classes of graphs, where the connections between
different vertices lack strong correlations - an assumption not always valid.
To bridge this gap, we focus on the following model: both users and campaigns
are represented as points uniformly distributed in the interval [0,1], and a
user is eligible to be paired with a campaign if they are similar enough, i.e.,
the distance between their respective points is less than c/N, with c a
model parameter and N the number of users. As a benchmark, we determine the
size of the optimal offline
matching in these bipartite random geometric graphs. In the online setting, we
investigate the number of matches made by the online algorithm closest, which
greedily pairs incoming points with their nearest available neighbors. We
demonstrate that the algorithm's performance can be compared to its fluid
limit, which is characterized as the solution to a specific partial
differential equation (PDE). From this PDE solution, we can compute the
competitive ratio of closest, and our computations reveal that it remains
significantly better than its worst-case guarantee. This model turns out to be
related to the online minimum cost matching problem, and we can extend the
results to refine certain findings in that area of research. Specifically, we
determine the exact asymptotic cost of closest in the excess regime,
providing a more accurate estimate than the previously known loose upper bound.
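A minimal simulation of the greedy algorithm closest in this model conveys the setting. The sketch below assumes the threshold form c/N reconstructed above, uses illustrative names, and is a toy experiment, not the paper's fluid-limit/PDE analysis.

```python
import bisect
import random

def simulate_closest(n, c, seed=None):
    """Toy simulation: n campaigns and n users are uniform on [0, 1]; users
    arrive one by one and each is matched by `closest` to the nearest still
    available campaign within distance c / n, if any. Returns matching size."""
    rng = random.Random(seed)
    campaigns = sorted(rng.random() for _ in range(n))   # available campaigns
    matched = 0
    for _ in range(n):                                   # online arrivals
        u = rng.random()
        i = bisect.bisect_left(campaigns, u)
        best = None                                      # nearest candidate
        for j in (i - 1, i):                             # left/right neighbors
            if 0 <= j < len(campaigns) and abs(campaigns[j] - u) <= c / n:
                if best is None or abs(campaigns[j] - u) < abs(campaigns[best] - u):
                    best = j
        if best is not None:
            campaigns.pop(best)                          # campaign is consumed
            matched += 1
    return matched

# fraction of users matched for a few connectivity parameters c
for c in (0.5, 1.0, 2.0):
    print(c, simulate_closest(10_000, c, seed=0) / 10_000)
```

Comparing this fraction with the offline optimum, which is computable in this one-dimensional model, gives an empirical handle on the competitive ratio discussed above.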
A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection
A key ingredient of out-of-distribution (OOD) detection is to exploit a trained
neural network by extracting statistical patterns and relationships across the
layers of the classifier in order to detect shifts in the expected input data
distribution. Despite achieving solid results, several state-of-the-art methods
rely on the penultimate or last layer outputs only, leaving behind valuable
information for OOD detection. Methods that do explore multiple layers require
either a special architecture or a supervised objective to do so. This work
adopts an original approach based on a functional view of the network that
exploits the sample's trajectories through the various layers and their
statistical dependencies. It goes beyond multivariate features aggregation and
introduces a baseline rooted in functional anomaly detection. In this new
framework, OOD detection translates into detecting samples whose trajectories
differ from the typical behavior characterized by the training set. We validate
our method and empirically demonstrate its effectiveness in OOD detection
compared to strong state-of-the-art baselines on computer vision benchmarks.
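To make the functional view concrete, here is a hedged sketch of the general idea: summarize each sample's pass through the network as a curve over depth, its "trajectory", and score a test sample by how far its curve deviates from typical training curves. The layer summary and the z-score band below are deliberately simple stand-ins, not the paper's functional anomaly detector, and all names are illustrative.

```python
import numpy as np

def layer_trajectory(activations):
    """Reduce a sample's per-layer activations to a 1-D curve over depth,
    here via the root-mean-square of each layer's activation values.
    `activations` is a list of per-layer numpy arrays for one sample."""
    return np.array([np.sqrt(np.mean(a ** 2)) for a in activations])

def fit_reference(train_trajectories):
    """Summarize training trajectories by a per-depth mean and std band."""
    T = np.stack(train_trajectories)        # (num_samples, num_layers)
    return T.mean(axis=0), T.std(axis=0) + 1e-8

def ood_score(trajectory, mean, std):
    """Deviation of a sample's trajectory from the typical training band,
    averaged over depth; larger values suggest out-of-distribution input."""
    return float(np.abs((trajectory - mean) / std).mean())
```

A threshold on this score, calibrated on held-out in-distribution data, then turns the trajectory deviation into an OOD decision.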